Compile-Time Minimisation of Load Imbalance in Loop

Authors

  • Rizos Sakellariou
  • John R. Gurd
Abstract

Parallelising compilers typically need some performance estimation capability in order to evaluate the trade-offs between different transformations. Such a capability requires sophisticated techniques for analysing the program and providing quantitative estimates to the compiler's internal cost model. Making use of techniques for symbolic evaluation of the number of iterations in a loop, this paper describes a novel compile-time scheme for partitioning loop nests in such a way that load imbalance is minimised. The scheme is based on a property of the class of canonical loop nests, namely that, upon partitioning into essentially equal-sized partitions along the index of the outermost loop, these can be combined in such a way as to achieve a balanced distribution of the computational load in the loop nest as-a-whole. A technique for handling non-canonical loop nests is also presented; essentially, this makes it possible to create a load-balanced partition for any loop nest which consists of loops whose bounds are linear functions of the loop indices. Experimental results on a virtual shared memory parallel computer demonstrate that the proposed scheme can achieve better performance than other compile-time schemes.

1 Introduction

In order to evaluate the performance trade-offs of different transformations, parallelising compilers are usually armed with some performance estimation capability; this issue has been addressed recently by a number of researchers [3, 7, 19]. Although the implementation details of these schemes vary, generally they attempt to identify sources of performance loss, such as load imbalance, interprocessor communication, cache misses, etc. [4, 6]. This has two implications for a parallelising compiler.

Firstly, the compiler must be capable of extracting quantitative information from programs; since parallelising compilers usually target the parallelisation of loop nests, significant information lies in the number of times each loop will be executed. This can be used, for instance, to estimate the amount of work assigned to each processor, or the number of non-local accesses to data [7].
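The idea in the abstract of combining equal-sized partitions of the outermost loop can be illustrated with a small sketch. The code below is not taken from the paper: it assumes a triangular loop nest (outer index i from 0 to n-1, inner loop running i + 1 times), splits the outer index range into 2p equal-sized chunks, and gives processor k the chunks k and 2p-1-k. This pairing is one plausible way to combine partitions and is an assumption here; all function names are hypothetical. The iteration counts use a closed form, in the spirit of the symbolic evaluation the abstract mentions.

```python
def triangular_work(lo, hi):
    """Inner-loop iterations for outer indices lo..hi-1, assuming the
    inner loop executes i + 1 times for outer index i (triangular nest)."""
    # Closed form of sum_{i=lo}^{hi-1} (i + 1).
    return (hi * (hi + 1) - lo * (lo + 1)) // 2


def chunk_bounds(n, parts):
    """Split range(n) into `parts` contiguous chunks of near-equal size."""
    return [(k * n // parts, (k + 1) * n // parts) for k in range(parts)]


def block_partition(n, p):
    """Baseline: one contiguous chunk of the outer loop per processor."""
    return [[chunk] for chunk in chunk_bounds(n, p)]


def folded_partition(n, p):
    """Assumed combining rule: processor k gets chunks k and 2p-1-k."""
    chunks = chunk_bounds(n, 2 * p)
    return [[chunks[k], chunks[2 * p - 1 - k]] for k in range(p)]


def loads(partition):
    """Total inner-loop iterations assigned to each processor."""
    return [sum(triangular_work(lo, hi) for lo, hi in chunks)
            for chunks in partition]


if __name__ == "__main__":
    n, p = 1200, 4
    print("block :", loads(block_partition(n, p)))
    print("folded:", loads(folded_partition(n, p)))
```

For n = 1200 and p = 4, the contiguous block split assigns between 45150 and 315150 inner iterations per processor, whereas the folded split assigns 180150 to each. Because the work per outer iteration grows linearly, a light chunk paired with its mirror-image heavy chunk always sums to the same total, which is why the pairing balances this particular nest.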

Similar resources

A Compile-Time Partitioning Strategy for Non-Rectangular Loop Nests

This paper presents a compile-time scheme for partitioning non-rectangular loop nests which consist of inner loops whose bounds depend on the index of the outermost, parallel loop. The minimisation of load imbalance, on the basis of symbolic cost estimates, is considered the main objective; however, options which may increase other sources of overhead are avoided. Experimental results on a virt...

Compile-Time Partitioning of Three-Dimensional Iteration Spaces

This paper presents a strategy for compile-time partitioning of generalised three-dimensional iteration spaces; it can be applied to loop nests comprising two inner nested loops both of which have bounds linearly dependent on the index of the outermost parallel loop. The strategy is analysed using symbolic analysis techniques for enumerating loop iterations which can provide estimates for the l...

Voltage Imbalance Compensation for Droop-Controlled Inverters in Islanded Microgrid

In this paper, a new control strategy is proposed for implementation in low-voltage microgrids with balanced/unbalanced load circumstances. The proposed scheme contains power droop controllers, inner voltage and current loops, a virtual impedance loop, and voltage imbalance compensation. The proposed strategy balances the voltage of the single-phase critical loads by compensating the im...

A New Adaptive Load-Shedding and Restoration Strategy for Autonomous Operation of Microgrids: A Real-Time Study

Islanding operation is one of the main features of a MicroGrid (MG), and is made possible by the presence of distributed energy resources (DERs). However, in order to deal with the control challenges that an MG faces during island operation, particularly when the transition is associated with certain excessive load, an efficient control strategy is required. This paper introduces a Centra...

New Static Scheduling and Elastic Load Balancing Methods for Parallel Query Processing

This paper presents a compile-time optimization methodology for complex relational query processing on a multiprocessor machine. A new scheduling algorithm is proposed to allocate the resources of the machine. A control mechanism traces the query processing and a special hierarchy of supervisors is introduced to interfere in case of load imbalance. Dynamic load balancing is then achieved using ...

Publication date: 1997